AITopics | correlation dimension

Collaborating Authors

correlation dimension

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

IntrinsicDimension,PersistentHomologyand GeneralizationinNeuralNetworks SupplementaryMaterial

Neural Information Processing SystemsFeb-8-2026, 05:24:21 GMT

S1 firsts gives some of the formal definitions and interpretations omitted from the main paper due to space limitations.

artificial intelligence, dimension, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

Correlation Dimension of Auto-Regressive Large Language Models

Du, Xin, Tanaka-Ishii, Kumiko

arXiv.org Artificial IntelligenceOct-27-2025

Large language models (LLMs) have achieved remarkable progress in natural language generation, yet they continue to display puzzling behaviors -- such as repetition and incoherence -- even when exhibiting low perplexity. This highlights a key limitation of conventional evaluation metrics, which emphasize local prediction accuracy while overlooking long-range structural complexity. We introduce correlation dimension, a fractal-geometric measure of self-similarity, to quantify the epistemological complexity of text as perceived by a language model. This measure captures the hierarchical recurrence structure of language, bridging local and global properties in a unified framework. Through extensive experiments, we show that correlation dimension (1) reveals three distinct phases during pretraining, (2) reflects context-dependent complexity, (3) indicates a model's tendency toward hallucination, and (4) reliably detects multiple forms of degeneration in generated text. The method is computationally efficient, robust to model quantization (down to 4-bit precision), broadly applicable across autoregressive architectures (e.g., Transformer and Mamba), and provides fresh insight into the generative dynamics of LLMs.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.21258

Country:

North America > United States (0.28)
Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Causal Regularization

Dominik Janzing

Neural Information Processing SystemsOct-2-2025, 08:58:29 GMT

Neural Information Processing Systems http://nips.cc/

regression, scenario 1, scenario 2, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)

Add feedback

Tailored minimal reservoir computing: on the bidirectional connection between nonlinearities in the reservoir and in data

Prosperino, Davide, Ma, Haochun, Räth, Christoph

arXiv.org Artificial IntelligenceApr-25-2025

We study how the degree of nonlinearity in the input data affects the optimal design of reservoir computers, focusing on how closely the model's nonlinearity should align with that of the data. By reducing minimal RCs to a single tunable nonlinearity parameter, we explore how the predictive performance varies with the degree of nonlinearity in the reservoir. To provide controlled testbeds, we generalize to the fractional Halvorsen system, a novel chaotic system with fractional exponents. Our experiments reveal that the prediction performance is maximized when the reservoir's nonlinearity matches the nonlinearity present in the data. In cases where multiple nonlinearities are present in the data, we find that the correlation dimension of the predicted signal is reconstructed correctly when the smallest nonlinearity is matched. We use this observation to propose a method for estimating the minimal nonlinearity in unknown time series by sweeping the reservoir exponent and identifying the transition to a successful reconstruction. Applying this method to both synthetic and real-world datasets, including financial time series, we demonstrate its practical viability. Finally, we transfer these insights to classical RC by augmenting traditional architectures with fractional, generalized reservoir states. This yields performance gains, particularly in resource-constrained scenarios such as physical reservoirs, where increasing reservoir size is impractical or economically unviable. Our work provides a principled route toward tailoring RCs to the intrinsic complexity of the systems they aim to model.

artificial intelligence, machine learning, nonlinearity, (16 more...)

arXiv.org Artificial Intelligence

2504.17503

Country: Europe > Germany (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

On consistent estimation of dimension values

Cholaquidis, Alejandro, Cuevas, Antonio, Pateiro-López, Beatriz

arXiv.org Machine LearningDec-18-2024

The problem of estimating, from a random sample of points, the dimension of a compact subset S of the Euclidean space is considered. The emphasis is put on consistency results in the statistical sense. That is, statements of convergence to the true dimension value when the sample size grows to infinity. Among the many available definitions of dimension, we have focused (on the grounds of its statistical tractability) on three notions: the Minkowski dimension, the correlation dimension and the, perhaps less popular, concept of pointwise dimension. We prove the statistical consistency of some natural estimators of these quantities. Our proofs partially rely on the use of an instrumental estimator formulated in terms of the empirical volume function Vn (r), defined as the Lebesgue measure of the set of points whose distance to the sample is at most r. In particular, we explore the case in which the true volume function V (r) of the target set S is a polynomial on some interval starting at zero. An empirical study is also included. Our study aims to provide some theoretical support, and some practical insights, for the problem of deciding whether or not the set S has a dimension smaller than that of the ambient space. This is a major statistical motivation of the dimension studies, in connection with the so-called Manifold Hypothesis.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

2412.13898

Country:

South America > Uruguay (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Indiana (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Controlling dynamical systems into unseen target states using machine learning

Köglmayr, Daniel, Haluszczynski, Alexander, Räth, Christoph

arXiv.org Artificial IntelligenceDec-13-2024

We present a novel, model-free, and data-driven methodology for controlling complex dynamical systems into previously unseen target states, including those with significantly different and complex dynamics. Leveraging a parameter-aware realization of next-generation reservoir computing, our approach accurately predicts system behavior in unobserved parameter regimes, enabling control over transitions to arbitrary target states. Crucially, this includes states with dynamics that differ fundamentally from known regimes, such as shifts from periodic to intermittent or chaotic behavior. The method's parameter-awareness facilitates non-stationary control, ensuring smooth transitions between states. By extending the applicability of machine learning-based control mechanisms to previously inaccessible target dynamics, this methodology opens the door to transformative new applications while maintaining exceptional efficiency. Our results highlight reservoir computing as a powerful alternative to traditional methods for dynamic system control.

artificial intelligence, machine learning, target state, (16 more...)

arXiv.org Artificial Intelligence

2412.10251

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Energy > Renewable (0.69)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Correlation Dimension of Natural Language in a Statistical Manifold

Du, Xin, Tanaka-Ishii, Kumiko

arXiv.org Artificial IntelligenceMay-15-2024

The correlation dimension of natural language is measured by applying the Grassberger-Procaccia algorithm to high-dimensional sequences produced by a large-scale language model. This method, previously studied only in a Euclidean space, is reformulated in a statistical manifold via the Fisher-Rao distance. Language exhibits a multifractal, with global self-similarity and a universal dimension around 6.5, which is smaller than those of simple discrete random sequences and larger than that of a Barab\'asi-Albert process. Long memory is the key to producing self-similarity. Our method is applicable to any probabilistic model of real-world discrete sequences, and we show an application to music data.

correlation dimension, dirichlet distribution, sequence, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1103/PhysRevResearch.6.L022028

2405.06321

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Media > Music (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

What is the $\textit{intrinsic}$ dimension of your binary data? -- and how to compute it quickly

Hanika, Tom, Hille, Tobias

arXiv.org Artificial IntelligenceApr-9-2024

Dimensionality is an important aspect for analyzing and understanding (high-dimensional) data. In their 2006 ICDM paper Tatti et al. answered the question for a (interpretable) dimension of binary data tables by introducing a normalized correlation dimension. In the present work we revisit their results and contrast them with a concept based notion of intrinsic dimension (ID) recently introduced for geometric data sets. To do this, we present a novel approximation for this ID that is based on computing concepts only up to a certain support value. We demonstrate and evaluate our approximation using all available datasets from Tatti et al., which have between 469 and 41271 extrinsic dimensions.

dimension, intrinsic dimension, obsdiam, (14 more...)

arXiv.org Artificial Intelligence

2404.06326

Country:

Europe > Germany (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)

Add feedback

Scaling Laws in Jet Classification

Batson, Joshua, Kahn, Yonatan

arXiv.org Artificial IntelligenceDec-4-2023

We demonstrate the emergence of scaling laws in the benchmark top versus QCD jet classification problem in collider physics. Six distinct physically-motivated classifiers exhibit power-law scaling of the binary cross-entropy test loss as a function of training set size, with distinct power law indices. This result highlights the importance of comparing classifiers as a function of dataset size rather than for a fixed training set, as the optimal classifier may change considerably as the dataset is scaled up. We speculate on the interpretation of our results in terms of previous models of scaling laws observed in natural language and image datasets.

classifier, efp, power law, (15 more...)

arXiv.org Artificial Intelligence

2312.02264

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Controlling dynamical systems to complex target states using machine learning: next-generation vs. classical reservoir computing

Haluszczynski, Alexander, Köglmayr, Daniel, Räth, Christoph

arXiv.org Artificial IntelligenceJul-14-2023

Controlling nonlinear dynamical systems using machine learning allows to not only drive systems into simple behavior like periodicity but also to more complex arbitrary dynamics. For this, it is crucial that a machine learning system can be trained to reproduce the target dynamics sufficiently well. On the example of forcing a chaotic parametrization of the Lorenz system into intermittent dynamics, we show first that classical reservoir computing excels at this task. In a next step, we compare those results based on different amounts of training data to an alternative setup, where next-generation reservoir computing is used instead. It turns out that while delivering comparable performance for usual amounts of training data, next-generation RC significantly outperforms in situations where only very limited data is available. This opens even further practical control applications in real world problems where data is restricted.

artificial intelligence, machine learning, reservoir computing, (17 more...)

arXiv.org Artificial Intelligence

2307.07195

Country:

Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback